James Webb and Hubble Space Telescopes snap images of same nebula, 10 years apart
The two images of Westerlund 2 show just how far the technology has come. Astronomers are studying the hundreds of young brown dwarfs inside the stellar nursery. In 2015, NASA celebrated the Hubble Space Telescope's 25th year in orbit by releasing one of its most stunning images to date: a colorful star cluster in the constellation Carina known as Westerlund 2. However, a lot can change in a decade.
- Government > Space Agency (0.39)
- Government > Regional Government > North America Government > United States Government (0.39)
Red Spider Nebula glows in ethereal new JWST image
This new James Webb Space Telescope image features a cosmic creepy-crawly called NGC 6537, the Red Spider Nebula. Using its sophisticated Near-InfraRed Camera (NIRCam), JWST has revealed never-before-seen details in this picturesque planetary nebula, set against a rich backdrop of thousands of stars, as if a cosmic spider had been caught in a web of its own.
- North America > United States (0.04)
- Asia > Middle East > Israel (0.04)
Textual interpretation of transient image classifications from large language models
Stoppa, Fiorenzo, Bulmus, Turan, Bloemen, Steven, Smartt, Stephen J., Groot, Paul J., Vreeswijk, Paul, Smith, Ken W.
Modern astronomical surveys deliver immense volumes of transient detections, yet distinguishing real astrophysical signals (for example, explosive events) from bogus imaging artefacts remains a challenge. Convolutional neural networks are effectively used for real versus bogus classification; however, their reliance on opaque latent representations hinders interpretability. Here we show that large language models (LLMs) can approach the performance level of a convolutional neural network on three optical transient survey datasets (Pan-STARRS, MeerLICHT and ATLAS) while simultaneously producing direct, human-readable descriptions for every candidate. Using only 15 examples and concise instructions, Google's LLM, Gemini, achieves a 93% average accuracy across datasets that span a range of resolution and pixel scales. We also show that a second LLM can assess the coherence of the output of the first model, enabling iterative refinement by identifying problematic cases. This framework allows users to define the desired classification behaviour through natural language and examples, bypassing traditional training pipelines. Furthermore, by generating textual descriptions of observed features, LLMs enable users to query classifications as if navigating an annotated catalogue, rather than deciphering abstract latent spaces. As next-generation telescopes and surveys further increase the amount of data available, LLM-based classification could help bridge the gap between automated detection and transparent, human-level understanding.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Africa > South Africa > Western Cape > Cape Town (0.04)
- Europe > Netherlands > Gelderland > Nijmegen (0.04)
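The few-shot, instruction-driven classification described in the abstract above can be sketched as a small pipeline: build a prompt from concise instructions plus a handful of labelled examples, then parse the model's label and human-readable description from the reply. This is an illustrative sketch only, with the LLM call abstracted behind a callable so any provider (such as Gemini) could be plugged in; the function and variable names are hypothetical, not the authors' code.

```python
# Hypothetical sketch of a few-shot "real vs bogus" transient classifier.
# The LLM itself is a black box passed in as `call_llm`.

def build_prompt(instructions, examples, candidate):
    """Assemble a few-shot prompt: instructions, labelled examples,
    then the unlabelled candidate to classify."""
    parts = [instructions]
    for features, label in examples:
        parts.append(f"Features: {features}\nLabel: {label}")
    parts.append(f"Features: {candidate}\nLabel:")
    return "\n\n".join(parts)

def classify(call_llm, instructions, examples, candidate):
    """Return (label, description) parsed from the LLM reply,
    assuming a 'label; free-text description' reply format."""
    reply = call_llm(build_prompt(instructions, examples, candidate))
    label, _, description = reply.partition(";")
    return label.strip().lower(), description.strip()

# Usage with a stand-in model in place of a real LLM call:
fake_llm = lambda prompt: "bogus; sharp single-pixel spike, no PSF profile"
label, why = classify(
    fake_llm,
    "Classify each transient detection as real or bogus and explain why.",
    [("round PSF, persists in both epochs", "real"),
     ("single hot pixel, no counterpart", "bogus")],
    "isolated bright pixel near chip edge",
)
```

Because the model returns a textual description alongside each label, a second LLM (or a human) can audit the description for coherence, which is the refinement loop the paper describes.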
MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning
Wang, Ke, Pan, Junting, Wei, Linda, Zhou, Aojun, Shi, Weikang, Lu, Zimu, Xiao, Han, Yang, Yunqiao, Ren, Houxing, Zhan, Mingjie, Li, Hongsheng
Natural language image-caption datasets, widely used for training Large Multimodal Models, mainly focus on natural scenarios and overlook the intricate details of mathematical figures that are critical for problem-solving, hindering the advancement of current LMMs in multimodal mathematical reasoning. To this end, we propose leveraging code as supervision for cross-modal alignment, since code inherently encodes all information needed to generate corresponding figures, establishing a precise connection between the two modalities. Specifically, we co-develop our image-to-code model and dataset with a model-in-the-loop approach, resulting in an image-to-code model, FigCodifier, and ImgCode-8.6M, the largest image-code dataset to date. Furthermore, we utilize FigCodifier to synthesize novel mathematical figures and then construct MM-MathInstruct-3M, a high-quality multimodal math instruction fine-tuning dataset. Finally, we present MathCoder-VL, trained with ImgCode-8.6M for cross-modal alignment and subsequently fine-tuned on MM-MathInstruct-3M for multimodal math problem solving. Our model achieves a new open-source SOTA across all six metrics. Notably, it surpasses GPT-4o and Claude 3.5 Sonnet on the geometry problem-solving subset of MathVista, achieving improvements of 8.9% and 9.2%, respectively. The dataset and models will be released at https://github.com/mathllm/MathCoder.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Europe > Monaco (0.04)
- Asia > China > Hong Kong (0.04)
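The core idea in the abstract above, that code deterministically generates a figure and therefore gives an exact (image, code) supervision pair, can be illustrated with a toy example. The tiny drawing DSL and function names below are invented stand-ins for real plotting code, not the paper's pipeline.

```python
# Toy illustration of code-as-supervision: running the code yields the
# figure, so every (image, code) pair is exactly aligned by construction.

def render(code):
    """Execute a tiny drawing DSL ('dot ROW COL' per line) on a 5x5 grid
    and return the raster it produces."""
    grid = [[0] * 5 for _ in range(5)]
    for line in code.strip().splitlines():
        cmd, r, c = line.split()
        if cmd == "dot":
            grid[int(r)][int(c)] = 1
    return grid

def make_pair(code):
    """One (image, code) training pair for image-to-code alignment."""
    return render(code), code

image, code = make_pair("dot 1 1\ndot 3 3")
```

In the real setting the renderer is a plotting library and the code is, for example, TikZ or matplotlib; the alignment property is the same.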
Apple will use its street view Maps photos to train AI models
Apple plans to start using images it collects for Maps to train its AI models. In a disclosure spotted by 9to5Mac, the company said starting this month it would use images it captures to provide its Look Around feature for the additional purpose of training some of its generative AI models. Look Around is Apple's answer to Google Street View. The company originally released the feature alongside its 2019 revamp of Apple Maps. The tool allows users to see locations from ground level.
Conquering images and the basis of transformative action
Our rapid immersion into online life has made us all ill. Through the generation, personalization, and dissemination of enchanting imagery, artificial technologies commodify the minds and hearts of the masses with nauseating precision and scale. Online networks, artificial intelligence (AI), social media, and digital news feeds fine-tune our beliefs and pursuits by establishing narratives that subdivide and polarize our communities and identities. Meanwhile those commanding these technologies conquer the final frontiers of our interior lives, social relations, earth, and cosmos. In the Attention Economy, our agency is restricted and our vitality is depleted for their narcissistic pursuits and pleasures. Generative AI empowers the forces that homogenize and eradicate life, not through some stupid "singularity" event, but through devaluing human creativity, labor, and social life. Using a fractured lens, we will examine how narratives and networks influence us on mental, social, and algorithmic levels. We will discuss how atomizing imagery -- ideals and pursuits that alienate, rather than invigorate the individual -- hijack people's agency to sustain the forces that destroy them. We will discover how empires build digital networks that optimize society and embolden narcissists to enforce social binaries that perpetuate the ceaseless expansion of consumption, exploitation, and hierarchy. Structural hierarchy in the world is reified through hierarchy in our beliefs and thinking. Only by seeing images as images and appreciating the similarity shared by opposing narratives can we facilitate transformative action and break away from the militaristic systems plaguing our lives.
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- North America > United States > Nebraska (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Media (1.00)
- Government (1.00)
- Information Technology (0.94)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.47)
CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations
Qi, Ji, Ding, Ming, Wang, Weihan, Bai, Yushi, Lv, Qingsong, Hong, Wenyi, Xu, Bin, Hou, Lei, Li, Juanzi, Dong, Yuxiao, Tang, Jie
Vision-Language Models (VLMs) have demonstrated their widespread viability thanks to extensive training in aligning visual instructions to answers. However, this conclusive alignment leads models to ignore critical visual reasoning, which further results in failures on meticulous visual problems and unfaithful responses. In this paper, we propose Chain of Manipulations, a mechanism that enables VLMs to solve problems with a series of manipulations, where each manipulation refers to an operation on the visual input, either from intrinsic abilities (e.g., grounding) acquired through prior training or from imitating human-like behaviors (e.g., zooming in). This mechanism encourages VLMs to generate faithful responses with evidential visual reasoning, and permits users to trace error causes along interpretable paths. We thus train CogCoM, a general 17B VLM with a memory-based compatible architecture endowed with this reasoning mechanism. Experiments show that our model achieves state-of-the-art performance across 8 benchmarks from 3 categories, and that a limited number of training steps with this data swiftly yields competitive performance. The code and data are publicly available at https://github.com/THUDM/CogCoM.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.96)
- Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.66)
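A "chain of manipulations" as described in the abstract above can be sketched as a sequence of named operations applied to the visual input, with each intermediate state recorded so the reasoning path stays traceable. The operation names and the nested-list "image" below are illustrative assumptions, not CogCoM's actual implementation.

```python
# Illustrative chain-of-manipulations sketch: each manipulation is a
# named operation on the current visual state, and the chain keeps an
# interpretable trace of every intermediate result.

def zoom_in(region):
    """Crop to (row0, row1, col0, col1), mimicking a human zooming in."""
    def op(image):
        r0, r1, c0, c1 = region
        return [row[c0:c1] for row in image[r0:r1]]
    return ("zoom_in", op)

def grounding(target):
    """Stand-in for a learned grounding step: locate a target value."""
    def op(image):
        return [(r, c) for r, row in enumerate(image)
                for c, v in enumerate(row) if v == target]
    return ("grounding", op)

def run_chain(image, manipulations):
    """Apply manipulations in order, recording (name, state) at each step
    so error causes can be traced back through the chain."""
    trace, state = [], image
    for name, op in manipulations:
        state = op(state)
        trace.append((name, state))
    return state, trace

image = [[0, 0, 0, 0],
         [0, 7, 0, 0],
         [0, 0, 0, 0]]
result, trace = run_chain(image, [zoom_in((0, 2, 0, 2)), grounding(7)])
```

If the final answer is wrong, the trace shows which manipulation produced the faulty intermediate state, which is the interpretability benefit the paper claims.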
The AI-Generated Child Abuse Nightmare Is Here
A horrific new era of ultrarealistic, AI-generated child sexual abuse images is now underway, experts warn. Offenders are using downloadable open source generative AI models, which can produce images, to devastating effect. The technology is being used to create hundreds of new images of children who have previously been abused. Offenders are sharing datasets of abuse images that can be used to customize AI models, and they're starting to sell monthly subscriptions to AI-generated child sexual abuse material (CSAM). The details of how the technology is being abused are included in a new, wide-ranging report released by the Internet Watch Foundation (IWF), a nonprofit based in the UK that scours and removes abuse content from the web.
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Law (1.00)
- Health & Medicine > Therapeutic Area > Pediatrics/Neonatology (1.00)
Optimizing the AI Development Process by Providing the Best Support Environment
The purpose of this study is to investigate the development process for artificial intelligence (AI) and machine learning (ML) applications in order to provide the best support environment. The main stages of ML are problem understanding, data management, model building, model deployment, and maintenance. This project focuses on investigating the data management stage of ML development and its obstacles, as it is the most important stage of machine learning development: the accuracy of the end model relies on the kind of data fed into the model. The biggest obstacle found at this stage was the lack of sufficient data for model learning, especially in fields where data is confidential. This project aimed to build and develop a framework for researchers and developers that can help solve the lack of sufficient data during the data management stage. The framework utilizes several data augmentation techniques that can be used to generate new data from the original dataset, which can improve the overall performance of ML applications by increasing the quantity and quality of the data available to feed the model. The framework was built in Python to perform data augmentation using deep learning advancements.
- Asia (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Arizona (0.04)
- Instructional Material > Course Syllabus & Notes (0.46)
- Research Report > New Finding (0.46)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.67)
- Information Technology > Security & Privacy (0.67)
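The augmentation framework described in the abstract above can be sketched as a registry of image transforms that expands a small dataset with extra views of each sample. The transform names below are generic geometric augmentations chosen for illustration; the study's actual framework and transform set are not reproduced here.

```python
# Minimal dependency-free sketch of a data augmentation registry:
# each transform maps an image (nested lists) to a new view, and the
# dataset is expanded with one copy per registered transform.

def hflip(image):
    """Mirror each row (horizontal flip)."""
    return [list(reversed(row)) for row in image]

def vflip(image):
    """Reverse row order (vertical flip)."""
    return list(reversed(image))

def rotate90(image):
    """Rotate 90 degrees clockwise."""
    return [list(row) for row in zip(*reversed(image))]

AUGMENTATIONS = [hflip, vflip, rotate90]

def augment_dataset(images):
    """Return the originals plus one augmented view per transform,
    multiplying the amount of training data available to the model."""
    out = list(images)
    for img in images:
        for fn in AUGMENTATIONS:
            out.append(fn(img))
    return out

data = [[[1, 2], [3, 4]]]
expanded = augment_dataset(data)
```

In practice a framework like this would also cover photometric transforms and generative techniques; the registry pattern is what lets new augmentations be plugged in without changing the pipeline.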